AMENDMENT



Assistant Commissioner for Patents 
Washington, D.C. 20231 

Dear Sir: 

In response to the Office Action dated April 28, 1999 (Paper No. 5) in the above-referenced 
patent application, please make the following amendments: 

IN THE SPECIFICATION: 

On page 4, line 14, please delete "chromatic" and substitute therefor --structural--. 
On page 15, line 12, please delete "8x" and substitute therefor ~x~. 
On page 15, line 14, please delete "Ay" and substitute therefor ~8y~. 

IN THE CLAIMS: 

Please amend the following claims: 

1. (Amended) A computerized method of extracting a key frame from a video, comprising
[the steps of]:

a) providing a reference frame; 

b) providing a current frame different from the reference frame;

c) determining a chromatic difference measure between the reference frame and the 
current frame; 

d) determining a structure difference measure between the reference frame and the 
current frame; and 

e) identifying the current frame as a key frame if the chromatic difference measure 
exceeds a chromatic threshold and the structure difference measure exceeds a structure 
threshold. 

2. (Amended) The method defined in Claim 1, additionally comprising [the step of] setting 
the current frame to be the reference frame if a key frame is identified. 

3. (Amended) The method defined in Claim 1, additionally comprising [the step of] 
repeating [steps c-e] c)-e) for a new current frame until the end of the video is reached. 

7. (Amended) The method defined in Claim 1, wherein [the step of] determining the 
structure difference measure is performed only if the chromatic difference measure exceeds the 
chromatic threshold. 

8. (Amended) A computerized method of extracting a key frame from a video having a 
plurality of frames, the method comprising [the steps of]: 

a) providing a reference frame; 

b) providing a current frame different from the reference frame; 

c) determining a first difference measure between the reference frame and the 
current frame; 

d) determining a second difference measure between the reference frame and the
current frame; and 

e) identifying the current frame as a key frame if the first difference measure 
exceeds a first threshold and the second difference measure exceeds a second threshold. 

9. (Amended) The method defined in Claim 8, additionally comprising [the step of] setting
the current frame to be the reference frame if a key frame is identified. 

11. (Amended) The method defined in Claim 8, additionally comprising [the step of] 
repeating [steps c-e] c)-e) for a new current frame until the end of the video is reached. 

14. (Amended) The method defined in Claim 8, wherein [the step of] determining the 
second difference measure is performed only if the first difference measure exceeds the first 
threshold. 

17. (Amended) The method defined in Claim 8, additionally comprising [the step of] 
determining a third difference measure between the reference frame and the current frame, and 
wherein the identifying [step] identifies the current frame as the key frame if the third difference 
measure exceeds a third threshold. 

18. (Amended) A computerized method of extracting a key frame from a video having a 
plurality of frames, the method comprising [the steps of]: 

a) providing a reference frame; 

b) providing a current frame different from the reference frame; 

c) determining a structure difference measure between the reference frame and the 
current frame; and 

d) identifying the current frame as a key frame if the structure difference measure 
exceeds a structure threshold. 

19. (Amended) The method defined in Claim 18, additionally comprising [the step of] 
setting the current frame to be the reference frame if a key frame is identified. 



20. (Amended) The method defined in Claim 18, additionally comprising [the step of] 
repeating [steps c and d] c) and d) for a new current frame until the end of the video is reached. 

REMARKS

Applicant amends the specification and Claims 1, 2, 3, 7, 8, 9, 11, 14, 17, 18, 19, 20 by this
paper. Claims 4-6, 10, 12-13, 15-16 and 21-22 remain unchanged and are presented for 
examination. Reconsideration and allowance of all Claims 1-22 in light of the present remarks is 
respectfully requested. 

Applicant has corrected clerical errors on pages 4 and 15 of the specification, and has made 
clarifying amendments to the claims by removing the term "steps" to avoid any triggering of the 
application of § 112, ¶ 6 to these method claims.

Discussion of the Claim Rejection under 35 U.S.C. § 102(e) 

Claims 1-22 were rejected under 35 U.S.C. § 102(e) as being anticipated by Zhang et al. 
("Zhang"), U.S. Patent No. 5,635,982. The Zhang patent reference describes three algorithms or 
methods: (1) a method of segmenting a video sequence of frames into individual camera shots by 
determining segment boundaries (shot segmentation); (2) a method of automatically selecting 
threshold values for use in determining segment boundaries, particularly the shot break threshold 
and the transition break threshold; and (3) a method of key frame selection from a sequence of 
frames. The first algorithm is described at Column 4, line 62 to Column 6, line 63; the second 
algorithm is described at Column 6, line 65 to Column 7, line 28; and the third algorithm is 
described at Column 7, lines 30-62. The key frame selection algorithm solves an entirely different 
problem than the segmentation algorithm. 

Applicant describes the prior techniques, including Zhang, at page 3 of the specification: 

Most existing techniques have focused on detecting abrupt and gradual scene 
transitions in video. However, the more essential problem to be solved is deriving 
an adequate visual representation of the visual content of the video. 

Most of the existing scene transition detection techniques, including 
Shahraray and Zhang et al., use the following measurements for gradual and abrupt 
scene transitions: 1) Intensity based difference measurements wherein the 
difference between two frames from the video which are separated by some time 
interval "T", is extracted. Typically, the difference measures include pixel 
difference measures, gray level global histogram measures, local pixel and 
histogram difference measures, color histogram measures, and so forth. 2) 
Thresholding of difference measurements wherein the difference measures are 
thresholded using either a single threshold or multiple thresholds. 

However, to generate an adequate visual representation of the visual content 
of the video, a system is needed wherein the efficacy of the existing techniques is 
not critically dependent on the threshold or decision criteria used to declare a scene 
break or scene transition. Using existing techniques, a low value of the threshold
would result in a oversampled representation of the video, whereas, a higher value 
would result in the loss of information. What is needed is a system wherein the 
choice of the decision criteria is a non-critical factor. 

In particular, the Zhang patent describes use of several difference metrics at Column 3,
line 18 to Column 4, line 18: pair-wise pixel comparison, likelihood ratio, histogram comparison,
and χ² test. The first two metrics, pair-wise pixel comparison and likelihood ratio, utilize the
intensity values of the pixels in successive frames. The histogram comparison "is less sensitive
to object motion, since it ignores the spatial changes in a frame." The χ² test is a modified
version of the histogram comparison "which makes the histogram comparison reflect the
difference between two frames more strongly." (emphasis added).

Applicant claims a computerized method of extracting a key frame from a video. 

Applicant may utilize a two-stage key frame extraction process, as claimed in Claim 1:

determining a chromatic difference measure between the reference frame and the 
current frame; determining a structure difference measure between the reference
frame and the current frame; and identifying the current frame as a key frame if 
the chromatic difference measure exceeds a chromatic threshold and the structure 
difference measure exceeds a structure threshold. 

In this process, determining the structure difference measure may be performed only if the
chromatic difference measure exceeds the chromatic threshold. The chromatic measurements
filter the video based on the brightness and color differences between frames, for example. The
structural difference measurement compares images based on the structure or edge content of the
image. Zhang does not utilize structural difference measurements. In fact, Zhang teaches away
from structural difference measurements at Column 7, lines 52-61:

The key frame extraction method as described in FIG. 4 is different from prior art. 
Prior art used motion analysis which depends heavily on tracing the positions and 
sizes of the objects being investigated using mathematical functions to extract a 
key frame. This method is not only too slow but also impractical ... the present 
invention extracts key frames purely based on the temporal variation of the video 
content as described in FIGS. 4 and 4A. (emphasis added) 

Therefore, since the above excerpt differentiates the temporal variation from tracking "the 
positions and sizes of objects being investigated" in the frames, the temporal variation is not 
referring to structural differences, but rather, must be referring to the only difference metrics 
described in the patent. These are the difference metrics described in Columns 3-4, which
measure intensity differences. For example, recall that Column 4 recites a description of one of 
the difference metrics as follows: "Histogram comparison ... ignores the spatial changes in a 
frame." 
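
By way of illustration only, the point that a histogram comparison ignores spatial changes can be shown with a short Python sketch. The sketch is not taken from Zhang or from Applicant's specification; the function names, the tiny 2x2 frames and the simple bin-by-bin difference are assumptions chosen for brevity.

```python
from collections import Counter

def histogram_difference(frame_a, frame_b):
    """Sum of absolute differences between the intensity histograms (assumed metric)."""
    hist_a = Counter(p for row in frame_a for p in row)
    hist_b = Counter(p for row in frame_b for p in row)
    # Counter returns 0 for intensities that are absent from a frame.
    return sum(abs(hist_a[v] - hist_b[v]) for v in set(hist_a) | set(hist_b))

def pixel_difference(frame_a, frame_b):
    """Number of pixel positions whose intensity differs between the frames."""
    return sum(a != b for ra, rb in zip(frame_a, frame_b) for a, b in zip(ra, rb))

# Hypothetical frames: same pixel values, mirrored layout.
left_bright = [[255, 0], [255, 0]]
right_bright = [[0, 255], [0, 255]]

print(histogram_difference(left_bright, right_bright))  # prints 0
print(pixel_difference(left_bright, right_bright))      # prints 4
```

The two frames contain the same intensities in different positions, so the histogram measure reports no difference even though every pixel position has changed.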

The combination of measuring two or more orthogonal image features in a hierarchical 
manner is also not shown in Zhang. As explained on page 2 of Applicant's specification, two 
frames may be compared based on several different sets of image properties, such as color 
properties, distribution of color over the image space, structural properties, and so forth. Since each 
image property represents only one aspect of the complete image, a system for generating an 
adequate representation by extracting orthogonal properties from the video is described by 
Applicant. 

While operating on a typical produced video, such as a television feed, the chromatic 
difference measurement may be tuned to pick up frames during gradual transitions, such as fades, 
dissolves, wipes and so forth. These frames are typically chromatically different but structurally 
similar. The redundancy in the output of the chromatic difference based measurement is filtered 
out by the structural difference measurement to then yield the actual keyframes. For example, 
frames in a fade have the same structure, but are significantly different chromatically due to the 
fading effect. 
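
For illustration only, the filtering behavior described above can be sketched in Python. The sketch is not reproduced from Applicant's specification: the histogram-based chromatic measure, the edge-map structure measure, the threshold values, the 4x4 grayscale frames and all function names are assumptions made to keep the example small.

```python
# Illustrative sketch only: measures, thresholds and names are assumptions.

def histogram(frame, bins=8, max_val=256):
    """Normalized intensity histogram of a grayscale frame (list of rows)."""
    counts = [0] * bins
    for row in frame:
        for pixel in row:
            counts[min(pixel * bins // max_val, bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]

def chromatic_difference(ref, cur):
    """Half the L1 distance between histograms: 0 = identical, 1 = disjoint."""
    return 0.5 * sum(abs(a - b) for a, b in zip(histogram(ref), histogram(cur)))

def edge_map(frame, rel=0.5):
    """Mark cells whose gradient is at least `rel` times the frame's peak gradient."""
    h, w = len(frame), len(frame[0])
    grads = [[abs(frame[y][x + 1] - frame[y][x]) + abs(frame[y + 1][x] - frame[y][x])
              for x in range(w - 1)] for y in range(h - 1)]
    peak = max(max(row) for row in grads)
    return [[peak > 0 and g >= rel * peak for g in row] for row in grads]

def structure_difference(ref, cur):
    """Fraction of cells whose edge/non-edge label differs between the frames."""
    pairs = [(a, b) for r1, r2 in zip(edge_map(ref), edge_map(cur))
             for a, b in zip(r1, r2)]
    return sum(a != b for a, b in pairs) / len(pairs)

def extract_key_frames(frames, chroma_thr=0.3, struct_thr=0.1):
    """Return indices of frames that pass both stages; frames[0] is the first reference."""
    ref, keys = frames[0], []
    for i in range(1, len(frames)):
        cur = frames[i]
        if chromatic_difference(ref, cur) <= chroma_thr:
            continue  # first stage fails: the structure measure is never computed
        if structure_difference(ref, cur) > struct_thr:
            keys.append(i)
            ref = cur  # the key frame becomes the new reference frame
    return keys

# Hypothetical frames: a scene, a faded copy of it, and a structurally different scene.
scene_a = [[200, 200, 40, 40] for _ in range(4)]
faded_a = [[100, 100, 20, 20] for _ in range(4)]
scene_b = [[250] * 4, [250] * 4, [90] * 4, [90] * 4]

print(extract_key_frames([scene_a, faded_a, scene_b]))  # prints [2]
```

In this sketch the faded frame passes the chromatic stage but is rejected by the structure stage, because its edge layout matches the reference; only the structurally different frame is reported.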

Thus, the combination of measuring two or more orthogonal image features in a 
hierarchical manner provides a significant improvement in generating an adequate representation of 
the video while keeping the computational process simple and efficient. The first feature 
measurement (e.g., chromatic difference) is selected to be computationally cheaper than the second 
measure. The second feature measurement (e.g., structural difference) is a more discriminatory 
measurement that extracts more information from a frame than the first measure. Applicant's 
hierarchical method can be extended to "N" stages or measures. 
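
As a hedged illustration of the "N" stage extension, the hierarchy can be sketched in Python as an ordered list of (measure, threshold) pairs evaluated cheapest first. The function names, the two stand-in measures and the threshold values below are assumptions for the example and do not appear in Applicant's specification.

```python
# Illustrative sketch only: the stage measures and thresholds are assumptions.

def passes_cascade(ref, cur, stages):
    """Return True only if every stage's measure exceeds its threshold.

    `stages` is ordered from cheapest to most expensive; evaluation stops at
    the first stage that fails, so later measures are not computed.
    """
    for measure, threshold in stages:
        if measure(ref, cur) <= threshold:
            return False
    return True

calls = {"cheap": 0, "costly": 0}  # counts how often each measure runs

def mean_difference(ref, cur):
    """Stand-in for a cheap first measure: difference of mean intensity."""
    calls["cheap"] += 1
    return abs(sum(ref) / len(ref) - sum(cur) / len(cur))

def position_difference(ref, cur):
    """Stand-in for a costlier second measure: fraction of changed positions."""
    calls["costly"] += 1
    return sum(a != b for a, b in zip(ref, cur)) / len(ref)

stages = [(mean_difference, 10.0), (position_difference, 0.5)]

reference = [10, 10, 10, 10]
similar = [12, 12, 12, 12]       # fails the first stage
different = [200, 200, 200, 10]  # passes both stages
```

A third or later measure is added by appending another pair to `stages`; calling `passes_cascade(reference, similar, stages)` in this sketch returns False without ever invoking the second measure.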

Applicant's definition for key frame extraction at Claims 6 and 13 includes the following: 
"the value of the first threshold and the value of the second threshold are each user-selectable." 
The Examiner stated that Zhang, at Column 7, lines 1-29, discloses that the first and second 
thresholds are user selectable. This cited passage describes automatically determining threshold 
values used for determining segment boundaries in the segmentation algorithm. The text 
describes how the shot break threshold is automatically computed based on statistics, and how 
the transition break threshold is based on an equation that includes the computed shot break
threshold. Thus, Zhang does not disclose a user-selectable chromatic threshold and a structure 
threshold for extracting a key frame. 

Applicant's definition for key frame extraction at Claims 7 and 14 includes the following: 
"the structure difference measure is performed only if the chromatic difference measure exceeds the 
chromatic threshold." The second of the two unique difference measurements is performed if the 
result of the first difference measurement exceeds its threshold. The Examiner stated that Zhang, 
at Column 6, lines 30-40, discloses that determining the structure difference measure is performed 
only if the chromatic difference measure exceeds the chromatic threshold. This cited passage 
describes a method of skipping a preselected number of frames (skip factor S) to determine 
potential segment boundaries in the segmentation algorithm, which has nothing to do with key 
frame selection. Thus, Zhang does not disclose performing the structure difference measure to 
extract key frames only if the chromatic difference measure exceeds the chromatic threshold. 

Applicant's definition for key frame extraction at Claim 10 includes the following: "the 
first difference measure is orthogonal to the second difference measure." The combination of 
measuring two or more orthogonal image features in a hierarchical manner is not disclosed in 
Zhang. Orthogonal image properties are not even discussed in Zhang. The Examiner apparently 
incorrectly grouped Claim 10 with the rejection of Claim 3, which concerns repeating for a new 
frame. 

Applicant's definition for key frame extraction at Claims 15 and 16 includes the 
following: "the second difference measure is computationally more expensive than the first 
difference measure" and "the second difference measure extracts more information than the first 
difference measure". The Examiner stated that Zhang, at Column 7, lines 1-60, discloses that the 
second difference measurement is more computationally intensive and extracts more information 
than the first difference measure. Lines 1-28 of the cited passage describe the second algorithm 
of the invention of automatically selecting threshold values for use in determining segment
boundaries. The remainder of the cited text describes the third algorithm on key frame selection 
wherein a selected difference metric is used. The Zhang patent does not, however, describe that 
two unique metrics are used. Moreover, even if two unique metrics would be used in the key frame 
algorithm, the Zhang patent does not describe that the second metric is more computationally 
intensive and extracts more information than the first metric. 

Applicant's definition for key frame extraction at Claim 17 includes the following:
"determining a third difference measure between the reference frame and the current frame, and 
wherein the identifying identifies the current frame as the key frame if the third difference measure 
exceeds a third threshold." The Examiner stated that Zhang, at Column 3, lines 45-68, discloses 
using a third difference measure. This cited passage in Zhang does identify one of the difference 
metrics. However, the Zhang reference does not disclose using more than one difference measure 
for key frame selection. 

Applicant submits that Zhang is overcome as a reference for Claims 1, 8 and 18. Since 
Claims 2-7, 9-17 and 19-22 are dependent on independent Claims 1, 8 and 18, respectively, 
pursuant to 35 U.S.C. § 112, ¶ 4, they incorporate by reference all the limitations of the claim to
which they refer. Therefore, the rejection of the dependent Claims 2-7, 9-17 and 19-22 has also 
been overcome. Therefore, in view of the above, it is submitted that Claims 1-22 are clearly 
distinguished from the cited art and are patentable. 



Conclusion

By this amendment, Applicant has amended the specification and claims. In view of the
foregoing amendments and remarks, Applicant respectfully submits that Claims 1-22 of the
above-identified application are in condition for allowance. However, if the Examiner finds any further
impediment to allowing all claims that can be resolved by telephone, the Examiner is respectfully
requested to call the undersigned.